[ana6, com8]: Add ana6Optimisation Module, apply changes in com8MoTPSA #1245
RolandFischbacher wants to merge 22 commits into master
❌ 1 blocking issue (1 total)
f66f702 to 38e42ca
Squash of 20 commits from RF_com8MoTPSA branch including:
- com8MoTPSA workflow improvements (chunked multiprocessing, path handling)
- Bayesian optimisation integration (ana6Optimisation module)
- Morris sensitivity analysis scripts
- AIMEC runout reference implementation
- probAna pickle saving and bounds
- Plotting and config improvements
Coverage Impact ⬇️ Merging this pull request will decrease total coverage on Modified Components (1)
Modified Files with Diff Coverage (2)
- Add bounds to paramValuesD in createSamplesWithVariation (StandardParameters)
- Add writing of visualisation scenario and sampling method to com8MoTPSACfg.ini
    if 'VISUALISATION' in config.sections():
        # config is inifile
        index = config['VISUALISATION']['scenario']
        if 'sampleMethod' in config['VISUALISATION']:
            sampleMethod = config['VISUALISATION']['sampleMethod']

If any .ini file in the config directory lacks a VISUALISATION section, the variables index and sampleMethod are never assigned, but they are unconditionally appended on lines 107-108. On the first such file, this raises UnboundLocalError. On later files no error is raised, but the stale value from a previous file is reused, so the data would be silently wrong.
Thank you for the info. I will initialize index and sampleMethod with np.nan to avoid UnboundLocalError and stale values.
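A minimal sketch of the fix described in this reply, assuming the values are collected in a loop over several ini files. The function name `readScenarioInfo` and the sample ini contents are hypothetical and not taken from the PR; only the `VISUALISATION` section and the `scenario` / `sampleMethod` keys come from the reviewed code.

```python
import configparser

import numpy as np


def readScenarioInfo(iniTexts):
    """Collect scenario index and sample method from several ini contents.

    Hypothetical helper for illustration, not the function from the PR.
    """
    indices, sampleMethods = [], []
    for text in iniTexts:
        config = configparser.ConfigParser()
        config.read_string(text)

        # Reset per file, so a missing section can neither raise
        # UnboundLocalError nor reuse values from the previous file.
        index = np.nan
        sampleMethod = np.nan

        if "VISUALISATION" in config.sections():
            index = config["VISUALISATION"].get("scenario", np.nan)
            sampleMethod = config["VISUALISATION"].get("sampleMethod", np.nan)

        indices.append(index)
        sampleMethods.append(sampleMethod)
    return indices, sampleMethods


# Made-up ini contents: one with the section, one without
withSection = "[VISUALISATION]\nscenario = 3\nsampleMethod = morris\n"
withoutSection = "[GENERAL]\nmodel = demo\n"
indices, sampleMethods = readScenarioInfo([withSection, withoutSection])
```

The second file yields `nan` for both values instead of repeating the entries of the first file.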
avaframe/ana4Stats/probAna.py
Outdated
    if modName.lower() in ["com1dfa", "com5snowslide", "com6rockavalanche", 'com8motpsa']:
        cfgStart["VISUALISATION"]["scenario"] = str(count1)
        cfgStart["INPUT"]["thFromIni"] = paramValuesD["thFromIni"]
        cfgStart["VISUALISATION"]["sampleMethod"] = cfg['PROBRUN']['sampleMethod']

The new line cfgStart["VISUALISATION"]["sampleMethod"] = cfg['PROBRUN']['sampleMethod'] reads sampleMethod from cfg['PROBRUN'].
However, probAnaCfg.ini has sampleMethod under [PROBRUN] only when probAna is the caller. If createCfgFiles is called from a different path where cfg doesn't have PROBRUN.sampleMethod, this will raise KeyError. The code also assumes VISUALISATION section exists in cfgStart for com8MoTPSA — while the new com8MoTPSACfg.ini does add it, there's no sampleMethod default there, making the flow dependent on the caller always providing this key.
Added a check whether 'VISUALISATION' exists in cfgStart; if not, it is added to cfgStart.
The sample method is now read with a fallback: if 'PROBANA' or 'sampleMethod' is missing, sample_method contains an empty string.
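The two changes in this reply could look roughly like the sketch below, assuming both configs are `configparser.ConfigParser` objects. The helper name `setSampleMethod` is hypothetical, and the section name `PROBRUN` follows the reviewed code line rather than the wording of the reply.

```python
import configparser


def setSampleMethod(cfgStart, cfg):
    """Copy the sample method into cfgStart['VISUALISATION'] with a fallback.

    Hypothetical helper for illustration, not the code from the PR.
    """
    if not cfgStart.has_section("VISUALISATION"):
        cfgStart.add_section("VISUALISATION")
    # The fallback covers both a missing section and a missing option
    sampleMethod = cfg.get("PROBRUN", "sampleMethod", fallback="")
    cfgStart["VISUALISATION"]["sampleMethod"] = sampleMethod
    return cfgStart


cfgStart = configparser.ConfigParser()
cfgWithout = configparser.ConfigParser()
cfgWith = configparser.ConfigParser()
cfgWith.read_string("[PROBRUN]\nsampleMethod = latin\n")

emptyResult = setSampleMethod(cfgStart, cfgWithout)["VISUALISATION"]["sampleMethod"]
filledResult = setSampleMethod(cfgStart, cfgWith)["VISUALISATION"]["sampleMethod"]
```

With this pattern a caller that does not provide the key gets an empty string instead of a KeyError.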
…i file, delete redundant code
…Files in optimisationUtils.py), add possibility to RUN writeCfgFiles with counter
…clean header Implement modName to make code general and remove adjustText package
…ror and stale values and remove copy paste error.
… added to cfgStart. And read sample method with fallback; meaning if 'PROBANA' or 'sampleMethod' is missing sample_method contains an empty string
… docstring, tidy up code, add if __name__ == '__main__': in runPlotMorrisConvergence.py and improve BOConvergencePlot.
…names from paper to com8MoTPSACfg.ini.
awirb left a comment
First comments, more to come.
    # check for allConfigurationsInfo to find computation info and add to info fetched from ini files
    if latest == False and isinstance(simDF, pd.DataFrame):
        # check if in allConfigurationsInfo also info for existing sims
        simDFALL, _ = readAllConfigurationInfo(avaDir, specDir="", configCsvName="allConfigurations")

Would you also need to set the modName variable here, so that it doesn't just read allConfigurations.csv from the com1DFA Outputs?
allConfigurations.csv is currently not available for the com8MoTPSA module.
    @@ -0,0 +1,64 @@
    ### Config File - This file contains the main settings for the optimisation process

    # Sidenote: (1) when running runOptimisation.py the working directory needs to be in the ana6Optimisation folder

Why is it actually located here and not in the runScripts directory?
    [GENERAL]
    # USER input for running and plotting a comparison of simulation result to reference polygon

Does this refer to the settings in runPlotAreaProfile?
Yes, this refers to the settings in runPlotAreaRefDiffs.py.
    [PARAM_BOUNDS]
    # 2 scenarios: choose 1 or 2
    scenario = 1
    #(1): morris is run prior, then dataframe of ranked input parameters is already saved by runMorris.py as pickle file, user only need to determine how much input paramters to use for optimisation

Suggested change:
    - #(1): morris is run prior, then dataframe of ranked input parameters is already saved by runMorris.py as pickle file, user only need to determine how much input paramters to use for optimisation
    + #(1): morris is run prior, then dataframe of ranked input parameters is already saved by runMorris.py as pickle file, user only needs to determine how much input parameters to use for optimisation
Applied locally.
    # 2 scenarios: choose 1 or 2
    scenario = 1
    #(1): morris is run prior, then dataframe of ranked input parameters is already saved by runMorris.py as pickle file, user only need to determine how much input paramters to use for optimisation
    topN = 3

What is this parameter? Please add a description.
And move the description of the second scenario up.
Also, this is only used in scenario 1, which uses the Morris analysis parameters and takes the topN ranked ones, right? Now I see why scenario (2) is explained below.
    Contains Cropshape and defines the maximal extent of runout area that is used for calculating areal indicators.

    - **REFDATA**
      Defines the runout area of the reference event.

Add the info that this needs to have the suffix _POLY.shp (if a polygon is required).
    - **Digital Elevation Model (DEM)**
      Must be placed directly in the `Inputs` directory and must cover the entire affected area.

    More Details here: https://docs.avaframe.org/en/latest/moduleCom1DFA.html

Do you mean the Inputs section of the linked page?
    [GENERAL]
    # USER input for running and plotting a comparison of simulation result to reference polygon
    resType = ppr
# Conflicts: # avaframe/ana6Optimisation/README_ana6.md
awirb left a comment
Comments for the optimisation part and plotting are still missing.
    - thresholdValueSimulation
    - modName
    avalancheDir : str
        Directory containing the directory of the reference avalanche

Do you mean just the path to the avalanche directory?
Yes, changed the description :)
    cfgAIMEC = cfgUtils.getModuleConfig(ana3AIMEC)
    rasterTransfo, resAnalysisDF, plotDict, _, pathDict = ana3AIMEC.fullAimecAnalysis(avalancheDir, cfgAIMEC)

Instead of using the passed module, consider loading the config in the runScript before you call the function and passing only the config; then the override is also easier (using ana3AIMEC_ana3AIMEC_override). Or is there a special reason for passing the module?
Passing the module into the calcArealIndicatorsAndAIMEC function is not necessary, since the AIMEC settings are not overridden. I think that not passing the module and loading the config here should be sufficient?
        )
        raise ValueError(message)

    paramLossSubsetDF = paramLossDF.sort_values(by='Loss', ascending=True)[:N]

Why is the [:N] needed? Isn't that from start to end if len(DF) is N?
It defines how many of the best-ranked Morris samples to use for statistics (e.g. the parameter distribution within these topN samples). I changed the name to topN.
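To illustrate the slicing discussed here, the sketch below sorts a small made-up table by loss and keeps the `topN` best rows. The parameter column `mu` and all values are invented for the example; only `Loss`, `sort_values` and the slice come from the reviewed line.

```python
import pandas as pd

# Made-up samples: one varied parameter and the resulting loss per simulation
paramLossDF = pd.DataFrame(
    {"mu": [0.10, 0.25, 0.40, 0.55], "Loss": [0.8, 0.2, 0.5, 0.1]}
)

topN = 2
# Sort ascending so the lowest loss comes first, then keep the first topN rows
paramLossSubsetDF = paramLossDF.sort_values(by="Loss", ascending=True)[:topN]
```

If `topN` equals the number of rows, the slice returns the whole sorted table, which is the case the reviewer asked about.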
    def createDFParameterLoss(df, paramSelected):
        """
        Create DataFrames linking selected parameters with the loss function.

Does "selected" mean the parameters that were used for the parameter variation with the Morris sampling method?
"Selected" depends on the scenario: if Morris is not run beforehand, it means all parameters that were varied; if Morris is run beforehand, it means only the topN most important parameters.
    # beta gives penalty if sim. comes short compared to the ref. (FN)
    tverskyAlpha = 2
    tverskyBeta = 1
    # Loss function is kombination of TverskyScore * weightTversky + RunoutNormalised * weight runout

Suggested change:
    - # Loss function is kombination of TverskyScore * weightTversky + RunoutNormalised * weight runout
    + # Loss function is a combination of TverskyScore * weightTversky + RunoutNormalised * weight runout
Changed locally.
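As a rough illustration of the weighted loss named in this config comment, the sketch below uses the standard Tversky index TP / (TP + alpha * FP + beta * FN) on boolean masks. Several points are assumptions and not confirmed by the PR: that alpha weights false positives, that the Tversky term enters the loss as 1 minus the index, and the weight values of 0.5. The function names are hypothetical.

```python
import numpy as np


def tverskyIndex(simMask, refMask, alpha=2.0, beta=1.0):
    """Standard Tversky index: TP / (TP + alpha * FP + beta * FN)."""
    sim = np.asarray(simMask, dtype=bool)
    ref = np.asarray(refMask, dtype=bool)
    tp = np.sum(sim & ref)   # simulated and observed
    fp = np.sum(sim & ~ref)  # simulated but not observed
    fn = np.sum(~sim & ref)  # observed but not simulated
    denominator = tp + alpha * fp + beta * fn
    return float(tp / denominator) if denominator > 0 else 0.0


def combinedLoss(simMask, refMask, runoutNormalised,
                 weightTversky=0.5, weightRunout=0.5, alpha=2.0, beta=1.0):
    """Weighted sum of a Tversky term and a normalised runout term.

    Assumption: the Tversky term is 1 - index, so a perfect overlap adds 0.
    """
    tverskyScore = 1.0 - tverskyIndex(simMask, refMask, alpha, beta)
    return weightTversky * tverskyScore + weightRunout * runoutNormalised


# Made-up 2 x 3 rasters: 3 cells overlap, 1 is overshoot, 1 is missed
refMask = [[1, 1, 0], [1, 1, 0]]
simMask = [[1, 1, 1], [1, 0, 0]]
loss = combinedLoss(simMask, refMask, runoutNormalised=0.2)
```

With alpha = 2 and beta = 1, one overshooting cell costs twice as much as one missed cell.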
… 2 (1 more important),
…consistent with runOptimisationCfg.ini) and add comments.
    fU.makeADir(outDir)

    # Get config from morris for path to morris results
    cfgDir = 'runMorrisSA.ini'

Isn't the name runMorrisSACfg.ini?
It still works like this because it then uses the stem, i.e. runMorrisSA, to build the path to the config file, but I think it should be runMorrisSA.py.
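The stem behaviour mentioned in this reply can be reproduced with `pathlib`. The helper below is hypothetical and only assumes that the config name is built as stem plus `Cfg.ini`; it is not the actual lookup code of the config utilities.

```python
from pathlib import Path


def configNameFromStem(fileName):
    """Build '<stem>Cfg.ini' from any file name (hypothetical helper)."""
    return f"{Path(fileName).stem}Cfg.ini"


fromIni = configNameFromStem("runMorrisSA.ini")
fromPy = configNameFromStem("runMorrisSA.py")
```

Both inputs share the stem `runMorrisSA`, which is why the `.ini` spelling still resolves to the same config file name.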
    - The top-N most influential parameters are selected for optimisation.

    Scenario 2 (Manual definition):
    - No prior Morris screening.

If I understood correctly, a Morris analysis could have been performed previously to decide which parameters should be considered in the optimisation and which ones do not have a strong effect on the loss function and are therefore not considered? So scenario 2 just means that initial simulations have to be performed to start the optimisation with, or are reused from the ana4Prob run?
So, in contrast, in scenario 1 the simulations performed for the Morris analysis using the Morris sampling are used directly?
Yes, the statement is correct.
    def loadVariationData(cfgOpt, outDir, avaDir):
        """
        Load parameter bounds and selected parameters for optimisation. Two execution modes are supported, controlled via
        cfgOpt['PARAM_BOUNDS']['scenario']:

In the description of the scenarios below, both say that the parameter bounds are read either from sa_parameter_bounds.pkl (scenario 1) or from paramValuesD.pickle created in runAna4ProbAna (scenario 2), so cfgOpt['PARAM_BOUNDS'] is not used? Consider mentioning already here that this relies on previous simulation runs performed using Morris analysis or ana4ProbAna.
cfgOpt['PARAM_BOUNDS'] is used to determine which file is read, either sa_parameter_bounds.pkl or paramValuesD.pickle.
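The file selection described in this reply might be sketched as follows. The helper `selectBoundsFile`, the lookup table and the single input directory are assumptions made for the example; only the two file names and the `scenario` key in `PARAM_BOUNDS` come from the thread.

```python
import configparser
from pathlib import Path

# File names as mentioned in the review thread, keyed by scenario number
BOUNDS_FILES = {1: "sa_parameter_bounds.pkl", 2: "paramValuesD.pickle"}


def selectBoundsFile(cfgOpt, inDir):
    """Return the path of the bounds file for the configured scenario.

    Hypothetical helper: the real directory layout may differ per scenario.
    """
    scenario = cfgOpt["PARAM_BOUNDS"].getint("scenario")
    if scenario not in BOUNDS_FILES:
        raise ValueError(f"scenario must be 1 or 2, got {scenario}")
    return Path(inDir) / BOUNDS_FILES[scenario]


cfgOpt = configparser.ConfigParser()
cfgOpt.read_string("[PARAM_BOUNDS]\nscenario = 1\ntopN = 3\n")
morrisFile = selectBoundsFile(cfgOpt, "work")

cfgOpt["PARAM_BOUNDS"]["scenario"] = "2"
probAnaFile = selectBoundsFile(cfgOpt, "work")
```

Raising on any other scenario value keeps a typo in the ini file from silently loading the wrong bounds.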
    paramBounds, paramSelected = optimisationUtils.loadVariationData(cfgOpt, inDir, avalancheDir)

    # Calculate Areal indicators and AIMEC and save the results in Outputs/ana3AIMEC and Outputs/out1Peak
    optimisationUtils.calcArealIndicatorsAndAimec(cfgOpt, avalancheDir, ana3AIMEC)

So for the areal indicators the settings are read from cfgOpt, and for AIMEC from the aimecCfg. Here we could use the override config functionality.
avaframe/out3Plot/outAna6Plots.py
Outdated
        plt.close(fig)


    def saveBestorCurrentModelrun(finalDF, paramSelected, ei=None, lcb=None, simName=None, csv_path='dummy.csv'):

saveBestOrSpecificSimulation?
…IMECcfg.ini settings and not pass AIMEC module for calcArealIndicatorsAndAIMEC

This PR introduces a new optimisation module `ana6Optimisation` for `com8MoTPSA` and updates the simulation workflow.

The module ana6Optimisation includes:

New files in ana6Optimisation:
- `runMorrisSA.py` (configuration: `runMorrisSACfg.ini`)
- `runPlotMorrisConvergence.py` (uses `runMorrisSACfg.ini`)
- `runOptimisation.py` (configuration: `runOptimisationCfg.ini`)
- `optimisationUtils.py`
- `README_ana6.md` (contains usage instructions)

New file in out3Plot:
- `outAna6Plots.py`

Changed workflow of running com8MoTPSA: